Author name disambiguation: What difference does it make in author-based citation analysis?

Authors

  • Andreas Strotmann
  • Dangzhi Zhao
Abstract

In this paper, we explore how strongly author name disambiguation (AND) affects the results of an author-based citation analysis study, and identify conditions under which the commonly used simplified approach of using surnames and first initials may suffice in practice. We compare author citation ranking and co-citation mapping results in the stem cell research field 2004-2009 between two AND approaches: the traditional simplified approach of using author surnames and first initials, and a sophisticated algorithmic approach. We find that the traditional approach leads to extremely distorted rankings and substantially distorted mappings of authors in this field when based on first- or all-author citation counting, whereas last-author-based citation ranking and co-citation mapping both appear relatively immune to the author name ambiguity problem. This is largely because romanized names of Chinese and Korean authors, who are very active in this field, are extremely ambiguous, but few of these researchers consistently publish as last authors in by-lines. We conclude that more earnest effort is required to deal with the author name ambiguity problem in both citation analysis and information retrieval, especially given the current trend towards globalization. In the stem cell field, where lab heads are traditionally listed as last authors in by-lines, last-author-based citation ranking and co-citation mapping using the traditional simple approach to author name disambiguation may serve as a simple workaround, but likely at the price of largely filtering out Chinese and Korean contributions to the field as well as important contributions by young researchers.
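The distortion described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the records, names, and citation counts below are hypothetical, the helper names `initial_key` and `rank` are invented for illustration, and full names stand in for properly disambiguated author identities. The sketch only shows how collapsing names to surname plus first initial can merge distinct authors and inflate a ranking.

```python
from collections import Counter

def initial_key(full_name: str) -> str:
    """Reduce 'Surname, Given' to 'surname f' (surname plus first initial)."""
    surname, _, given = full_name.partition(",")
    given = given.strip()
    initial = given[0].lower() if given else ""
    return f"{surname.strip().lower()} {initial}".strip()

# Hypothetical cited-author records: (author name, citations received).
# The names and counts are invented and do not come from the paper.
records = [
    ("Wang, Jing", 40),
    ("Wang, Jun", 35),
    ("Wang, Jian", 30),
    ("Smith, Alice", 60),
]

def rank(records, key=lambda name: name):
    """Sum citations per author key and return (key, total) pairs, highest first."""
    counts = Counter()
    for name, cites in records:
        counts[key(name)] += cites
    return counts.most_common()

# Full names act as a stand-in for disambiguated identities (an assumption).
by_full_name = rank(records)
# Surname plus first initial: the simplified approach the abstract discusses.
by_initial = rank(records, key=initial_key)

print(by_full_name)
print(by_initial)
```

In this toy data, three distinct authors collapse into the single key `wang j`, which then outranks the author who leads when names are kept apart.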

Similar resources

Improving the Accuracy of Author Name Disambiguation Using Ensemble Clustering

Today, digital libraries are important academic resources that hold millions of citations and essential bibliographic information such as titles, author names, and location of publications. From the perspective of knowledge accumulation management, the ability to search quickly and accurately for desired content is of great importance. The complexity and similarity in these resources cause many challenges and...

A Real-time Heuristic-based Unsupervised Method for Name Disambiguation in Digital Libraries

This paper addresses the problem of name disambiguation in the context of digital libraries that administer bibliographic citations. The problem occurs when multiple authors share a common name or when multiple name variations for an author appear in citation records. Name disambiguation is not a trivial task, and most digital libraries do not provide an efficient way to accurately identify the...

"I Cannot Tell What the Dickens His Name Is": Name Disambiguation in Institutional Repositories

Introduction: Authors who publish under more than one form of their name, multiple authors with the same name, and incomplete author information can all create challenges for repository staff when entering metadata. Unless properly addressed, these variations and duplications can result in search and retrieval errors for users. Name disambiguation, the process of identifying, merging, and making...

Reducing Fragmentation in Incremental Author Name Disambiguation

Author name ambiguity is a hard problem that occurs when several authors publish articles under the same name or when the same author publishes articles under different names. Traditionally, automatic disambiguation methods process the author names of all citation records in a repository. Aiming at efficiency, incremental methods disambiguate author names only when new citation records are inse...

A tool for generating synthetic authorship records for evaluating author name disambiguation methods

DOI: http://dx.doi.org/10.1016/j.ins.2012.04.022. The author name disambiguation task has to deal with uncertainties related to the possib...

Journal:
  • JASIST

Volume 63

Publication date: 2012